Recent results in imaging technologies and non-linear dynamics make it possible to
relate the structure and dynamics of functional brain networks to different mental 
tasks and to build theoretical models for the description and prediction of cognitive 
activity. Such models are non-linear dynamical descriptions of the interaction of the core 
components — brain modes — participating in a specific mental function. The dynamical 
images of different mental processes depend on their temporal features. The dynamics of 
many cognitive functions are transient. They are often observed as a chain of sequentially 
changing metastable states. A stable heteroclinic channel (SHC) consisting of a chain 
of saddles — metastable states — connected by unstable separatrices is a mathematical 
image for robust transients. In this paper we focus on hierarchical chunking dynamics 
that can represent several forms of transient cognitive activity. Chunking is a dynamical
phenomenon that nature uses to process long sequences of information by
dividing them into shorter information items. Chunking, for example, makes the use of
short-term memory more efficient by breaking up long strings of information (as in language,
where a novel is divided into chapters, paragraphs, sentences, and
finally words). Chunking is important in many processes of perception, learning, and
cognition in humans and animals. Based on anatomical information about the hierarchical 
organization of functional brain networks, we propose a cognitive network architecture 
that hierarchically chunks and super-chunks switching sequences of metastable states 
produced by winnerless competitive heteroclinic dynamics. 



Keywords: cognitive dynamics, stable heteroclinic channel, transient dynamics, low dimensionality of brain 
activity, hierarchical sequences, chunking and superchunking, cognition modeling principles 



INTRODUCTION 

Chunking is a dynamical phenomenon that the brain uses for pro- 
cessing long informational sequences. The concept of chunk was 
introduced by Miller (1956). His key notion is that short-term 
storage is not rigid but amenable to strategies such as chunk- 
ing that can expand its capacity. Miller's work drew plenty of 
attention to the concept of short-term memory and its functional 
characteristics. Chunking involves two processes: concatenation 
of units in a block and segmentation of the blocks. In general, 
chunking is related to the hierarchical organization of perceptual, 
cognitive, or behavioral sequential activity. In particular, in motor
control (see Rosenbaum et al., 1983) sequences can consist of subsequences,
and these can in turn consist of sub-subsequences, etc.
The natural hierarchical organization of long sequences is a result 
of the activity of specific brain functional networks. Such net- 
works include many different brain areas and some of them are 
also organized in a hierarchical manner. A well-known example
is Broca's area, which has been suggested to act as a "supramodal
syntactic processor" able to process any type of hierarchically
organized sequence (Grossman, 1980; Tettamanti and Weniger, 2006).
This hypothesis is based on findings that the region is involved not
only in processing language syntax (Musso et al., 2003) but also in
syntax-like aspects of non-linguistic tasks, for example, the performance
of specific movements and music (Fadiga et al., 2009), as several fMRI
studies (Bahlmann et al., 2008, 2009) seem to confirm. Clerget et al. (2013)
hypothesize that motor behavior shares some similarities with language,
namely that a complex action can be viewed as a chain of subordinate
movements, which need to be combined according to certain
rules in order to reach a given goal (Dehaene and Changeux, 1997;
Dominey et al., 2003; Botvinick, 2008).

What are the mechanisms that transform the extremely complex, noisy, and
many-dimensional brain activity into a rather regular, low-dimensional, and
even predictable cognitive behavior? In other words, what are the mechanisms
underlying the dynamics of the mind, including chunking? This is one of the
most challenging questions in today's neuroscience and cognitive science.
Recent continuous advances in non-invasive brain imaging make it possible to
assess the structural connectivity of the brain and the corresponding
evolution of the spatio-temporal activity in detail.

In our view, metastability is a key element of transient cog- 
nitive dynamics participating in chunking processes. The idea 
of the spatiotemporal organization of brain dynamic activity 
through transient, metastable states emerged more than 15 years 
ago (Kelso, 1995; Friston, 1997). According to this scenario, such 
dynamics can be represented as a sequential switching between 
different metastable states (for a description of the mathematical 
basis of this scenario see Rabinovich et al., 2008a,b). Metastable
transient dynamics represent a balance between the segregation
of focused cognitive processing and the flexible integration of
distributed brain areas. Such integration is necessary for the
performance of a specific cognitive function (Bressler and Kelso,
2001; Meehan and Bressler, 2012). The existence of connections
that are prevalent over long periods of time supports the
well-regarded concept of a hierarchical organization of neural
processing (Engel et al., 2001), which is the basis for understanding
the origin of chunking dynamics. Because the dimensionality of
cognition depends on the number of activated (in contrast to the
potentially observable) metastable states, it is important to remember
that the brain selects the necessary metastable states and suppresses
those that are irrelevant to the goal of the cognitive process, resulting
in a reduced dimensionality. The low dimensionality of brain cognitive
dynamics rests on two factors: first, the manner in which the cognitive
task is encoded (an external or internal stimulus determining a specific
cognitive task excites a set of elements of the community networks that
are responsible for the performance of such cognitive activities); and
second, the existence of a specific hierarchical organization of the
global brain networks, which perform a specific cognitive task with a
moderate number of brain modes.

Based on experimental data suggesting that the processing of
sequential cognitive activity is implemented in the brain, on
computational grounds, by spatiotemporal pattern dynamics (see
also Sahin et al., 2009), we build here a general dynamical model
that produces hierarchical chunking of sequences and suggests
a plausible neural mechanism of chunking dynamics in the
brain. This model is reasonably low-dimensional, which allows a
detailed dynamical analysis.

MATERIALS AND METHODS 

A top-down approach to modeling transient cognitive dynamics that takes
into account the experimental observations described in the
Introduction is to use kinetic equations for the description of
spatiotemporal mental modes that contain the discussed metastable
states as equilibrium points. The set of brain patterns that sequentially
change during the performance of the cognitive task
determines the spatial structure of the modes and the associated
connection matrix among them. Using this type of model, we
can integrate our knowledge about brain activity with
these new ideas related to heteroclinic sequences and
their interactions, i.e., heteroclinic networks.

As a top-down departing point, we need a mathematical object 
that can describe robust transient dynamics and their associated 
information processing. Once we have this object, we can imple- 
ment it through a set of canonic equations that can be used to 
study transient activity at different brain description levels, and in 
particular to address chunking dynamics. A mathematical image 
of robust transient sequential dynamics must have two principal 
features. First, it must be resistant to noise and reliable even in 
the context of small variations in initial conditions, so that the 
succession of states visited by the system (its trajectory, or transient)
is stable. Second, the transients must be input-specific in order to
contain information about what caused them. These two requirements pose
fundamental contradictions for the use of transient dynamics
in the description of brain activity. Transient dynamics are inherently
unstable: any transient depends on initial conditions and
cannot be reproduced from arbitrary initial conditions. On the
other hand, dynamical robustness in principle prevents sensitivity
to informative perturbations. These contradictions can be resolved
through the concept of metastability, which was introduced to
cognitive science at the end of the last century (Kelso, 1995;
Friston, 1997, 2000; Fingelkurts and Fingelkurts, 2006; Oullier
and Kelso, 2006; Gros, 2007; Ito et al., 2007).

A stable heteroclinic channel (SHC) is a mathematical object
that meets the requirements discussed above and can implement
such stable transients. A SHC is defined by a sequence
of successive metastable "saddle" states that are connected by
separatrices. Under proper conditions, all the trajectories in the
neighborhood of the saddle metastable states that form the
chain remain in the channel, ensuring robustness and reproducibility
over a wide range of control parameters (Rabinovich
et al., 2008b). The stability of a channel means that trajectories in
the channel do not leave it until the end of the channel is reached.

A simple model to implement SHCs is a generalized Lotka-Volterra
equation with N interactive elements:

$$\frac{dA_i(t)}{dt} = A_i(t)\, F\Big(\sigma_i(S_k) - \sum_{j=1}^{N} \rho_{ij} A_j(t)\Big) + A_i(t)\,\eta_i(t), \qquad i = 1, \ldots, N \tag{1}$$

where $A_i(t) > 0$ is the activity rate of element $i$, $\sigma_i$ is the gain
function that controls the impact of the stimulus, $S_k$ is an environmental
stimulus, $\rho_{ij}$ determines the interaction between the
variables, $\eta_i$ represents the noise level, and $F$ is a function, in
the simplest case a linear function. The state portrait of the system
often contains a heteroclinic sequence linking saddle points.
These saddles can be interpreted as successive and temporary
winners in a never-ending competitive game, i.e., winnerless
competition (WLC) dynamics (Rabinovich et al., 2001, 2006). In
neural systems, because a representative model must produce
sequences of connected neuronal population states (the saddle
points), the neural connectivity $\rho_{ij}$ must be asymmetric, as
determined by the theoretical examination of this model (Huerta and
Rabinovich, 2004). Although many connection statistics probably
work for stable heteroclinic-type dynamics, it is likely that
connectivity within biological networks is, to some extent at least, the
result of optimization by evolution and synaptic plasticity. It is
important to emphasize that Equation (1) is just an elementary
building block for different levels of the chunking hierarchy that
we will describe below.
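
As an illustration of the winnerless competition described by Equation (1), the following minimal sketch integrates the generalized Lotka-Volterra model with a linear F for three elements. The function names, the step size, the initial condition, and the particular asymmetric matrix (unit growth rates, inhibition 0.5 toward the next element and 1.51 toward the previous one) are assumptions chosen for illustration; they are not parameters reported in this paper.

```python
import numpy as np

def simulate_wlc(sigma, rho, a0, dt=0.01, steps=20000, noise=0.0, seed=0):
    """Euler-Maruyama steps for dA_i/dt = A_i (sigma_i - sum_j rho_ij A_j) + A_i eta_i."""
    rng = np.random.default_rng(seed)
    sigma = np.asarray(sigma, dtype=float)
    rho = np.asarray(rho, dtype=float)
    a = np.array(a0, dtype=float)
    traj = np.empty((steps, a.size))
    for n in range(steps):
        traj[n] = a
        growth = sigma - rho @ a                      # linear F, the simplest case
        kick = noise * np.sqrt(dt) * rng.standard_normal(a.size)
        a = a * (1.0 + dt * growth + kick)            # multiplicative update, stays positive for small dt
    return traj

def winner_sequence(traj):
    """Indices of the most active element, with consecutive repeats removed."""
    winners = traj.argmax(axis=1)
    changed = np.r_[True, winners[1:] != winners[:-1]]
    return winners[changed]

# Hypothetical asymmetric inhibition: row i lists the inhibition received by element i.
sigma = np.ones(3)
rho = np.array([[1.00, 1.51, 0.50],
                [0.50, 1.00, 1.51],
                [1.51, 0.50, 1.00]])
traj = simulate_wlc(sigma, rho, a0=[0.9, 0.05, 0.01])
order = winner_sequence(traj)
print(order[:7])
```

With this matrix, the element that follows the current winner has a positive growth rate at the corresponding saddle while the preceding one is suppressed, so the temporary winners replace one another in a fixed cyclic order. Setting `noise` to a small positive value adds the multiplicative noise term of Equation (1).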

Models like the generalized Lotka-Volterra equations make it possible to
establish the conditions necessary for transient stability, and they
display stable, sequential, and cyclic activation of their components,
the simplest variant of WLC. A network with several degrees
of freedom and asymmetric connections can generate structurally
stable sequences, i.e., transients, each shaped by one input.
Asymmetric inhibitory connectivity helps to solve the apparent
paradox that sensitivity and reliability can coexist in a network
(Huerta and Rabinovich, 2004; Nowotny and Rabinovich, 2007;
Rabinovich et al., 2008b; Rabinovich and Varona, 2011). The
neurons or modes participating in a SHC are assigned by the
stimulus, by virtue of their direct and/or indirect input from the
neurons activated by that stimulus. The joint action of the external
input and a stimulus-dependent connectivity matrix defines
the stimulus-specific heteroclinic channel. In addition, asymmetric
inhibition coordinates the sequential activity and keeps a
heteroclinic channel stable.

FIGURE 1 | Architecture of the three-level cognitive network
responsible for the grouping of informational items. Each level of the
hierarchy is described by its own Lotka-Volterra type equations [see (2)-(6)]
with connection matrices $\rho$, $\xi$, and $\varsigma$. Black circles represent inhibitory
connections; triangles represent excitatory connections responsible for the
choosing of the informational items. Spheres represent the informational
items or units (metastable states). Different colors indicate different
chunks. All connections inside the elementary items are inhibitory.

The WLC concept is directly related to the sequential dynamics
of metastable states that are activated by inputs that do not
destroy the origin of a competitive process. This paradigm can
explain and predict many dynamical phenomena in neural networks
with excitatory and inhibitory synaptic connections. Based
on the requirement of stability, this formalism has been used
(i) to assess the dynamical origin of finite working memory (WM)
capacity based upon WLC amongst available informational items
(Bick and Rabinovich, 2009; Rabinovich et al., 2012); (ii) to build
a dynamical model of information binding for transients that can
describe the interaction of different sensory information flows
that are generated concurrently (Rabinovich et al., 2010a); (iii) to
model the sequential interaction between emotion and cognition
(Rabinovich et al., 2010b); (iv) to represent attention dynamics
(Rabinovich et al., 2013); and (v) to assess the dynamics of
pathological states in mental disorders (Bystritsky et al., 2012;
Rabinovich et al., 2013). Here we focus on a model of hierarchical
chunking dynamics that can represent several forms of cognitive
activity such as WM and speech construction.

As we discussed in the Introduction, chunking is grouping
or categorizing related issues or information into smaller, more
meaningful, and more compact units. Think about how hard it would
be to read a long review paper without chapters, subchapters,
paragraphs, and separated sentences. Chunking is a naturally
occurring process that can be actively used to break down problems
in order to think, understand, and improvise more
efficiently. This is because it is easier to process chunked tasks or
perceptual data. In particular, it is much easier to learn and
recall such data. Mathematically, the "chunking principle" can 
be viewed as the transformation of a chain of metastable states 
along a transient process to the chain of groups of such states. 
It is a key dynamical idea that nature may use to make cognitive 
information processing more effective in the context of a complex 
environment. 

Chunking processes in human perception, learning, and performance
of a cognitive task can be automatic and directly
linked to the environmental stimuli, or controllable by a
goal-oriented intrinsic signal (Gobet et al., 2001). It is important to
note that chunking is a strategy that supports increasing speed
and accuracy through the formation of hierarchical memory
structures and complex task-dependent behavioral sequences.
Two competitive processes form temporal chunking sequences:
one separates long sequences into shorter groups of information
items that are easier to perform, and the second connects them to
express a long sequence as a unified thought or behavioral action
(Friederici et al., 2011; Chekaf and Matha, 2012).

Hierarchical chunking dynamics can be implemented in a
model of cognitive networks whose information processing relies
on SHCs. Figure 1 illustrates a chunking heteroclinic cognitive
network for two hierarchical informational groups: elementary
items and chunking (integrated) informational items that include
many elementary units interacting through dynamical connections.
It is reasonable to hypothesize that functionally there are
two different cognitive networks from at least two different
hierarchical levels that are responsible for (i) the organization of the
sequence of items inside chunks, and (ii) the formation of the
chunk sequence. In particular, this hypothesis is supported by an
experiment on chunking during visuomotor sequence learning
(Sakai et al., 2003). It was shown that each motor cluster is
processed as a single memory unit, a chunk. A learned visuomotor
sequence is a sequence of chunks that contain several
elementary movements. The authors of this work have shown
that a key role in the process of chunk formation is played by
a brain network including the dominant parietal area, the basal
ganglia, and the presupplementary motor area (see also
Ribas-Fernandes et al., 2011 and Bor and Seth, 2012, where the authors
discuss the chunking structure of conscious processes).

Below we suggest a three-level hierarchical model for the
description of the chunking dynamics. Inhibition plays a key role
in this model, as it is responsible for the execution of three functions:
(i) competition between elementary informational items in order
to produce stable sequences of metastable states, (ii) generation
of the chunking sequence, and (iii) control of the performance
of the sequential task. In recent years, the investigation of the
hierarchical control between different levels of representation and
information processing has become one of the hot subjects in
cognitive science. This issue is important for understanding how the
mind controls behavior and itself. In particular, the relationship
between chunking (a sequence-level process) and task-set inhibition
(a task-level process) in the performance of task sequences
was investigated by Koch et al. (2006), Schneider (2007), and Li et al.
(2010); for a description of "chunks of chunks," or "superchunks,"
see Rosenberg and Feigenson (2013).

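To understand the emergence of hierarchical chunking dynamics
in a model we need to depart from Equation (1) in the
following direction (cf. Figure 1):

$$\dot{X}_i^{lk} = X_i^{lk}\Big(\sigma_i^{lk}(S, C)\cdot Y^{lk} - \sum_{j=1}^{N^{lk}} \rho_{ij}^{lk}(S, C)\, X_j^{lk}\Big) \tag{2}$$

$$\tau\,\dot{Y}^{lk} = Y^{lk}\Big(V^{l} - \beta(C)\sum_{i=1}^{N^{lk}} X_i^{lk} - Z^{lk}\Big) \tag{3}$$

$$\theta(C)\,\dot{Z}^{lk} = \sum_{m=1}^{M^{l}} \xi_{km}^{l}(S, C)\, Y^{lm} - Z^{lk} \tag{4}$$

$$T\,\dot{V}^{l} = V^{l}\Big(1 - \delta(C)\sum_{k=1}^{M^{l}} Y^{lk} - W^{l}\Big) \tag{5}$$

$$\Theta(C)\,\dot{W}^{l} = \sum_{q=1}^{P} \varsigma_{lq}(S, C)\, V^{q} - W^{l} \tag{6}$$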


Here $X_i^{lk}$ characterizes the $i$-th informational item associated with
the $k$-th chunk and $l$-th superchunk, $\sigma_i^{lk}(S, C)$ is the growth rate
for each informational item determined by the stimulus $S$ and
the cognitive task $C$, and $\rho_{ij}^{lk}(S, C)$ is the matrix of inhibitory
connections among basic informational items. In this model $Y^{lk}$
characterizes the $k$-th chunk associated with the $l$-th superchunk
$V^{l}$, with corresponding characteristic times $\tau$ and $T$, respectively;
$\beta(C)$ represents the strength of the inhibition between the
informational items and the chunk, and $\delta(C)$ that between the chunks
and the superchunk. Also, $Z^{lk}$ describes the synaptic dynamics for
the $k$-th chunk associated with the $l$-th superchunk, with $\xi_{km}^{l}(S, C)$
the matrix of inhibitory connections between chunks (black circles
in Figure 1), and $W^{l}$ describes the synaptic dynamics for
the $l$-th superchunk, with $\varsigma_{lq}(S, C)$ the matrix of inhibitory
connections between superchunks; the corresponding characteristic
times are $\theta(C)$ and $\Theta(C)$. In this model, $\beta(C)$ and $\delta(C)$ are
adaptation parameters that determine the timing relationship between
a basic informational chain and the chunking and superchunking
modulation. The chunking variables also satisfy generalized
Lotka-Volterra (canonic) equations, which allows them to form a
stable sequence. Because of this, the chunking variables in fact play
the role of cognitive controllers. The parameters for Equations
(3)-(5) in the simulations below were chosen with this aim.
Since chunking dynamics has to take into account the characteristic
time of chunk formation, the competition between
different chunks has to be delayed; for this we used an inhibition
described by a first-order kinetic model. At the same time,
the competition among elementary informational items is implemented
by fixed-weight instantaneous synapses. The same logic
has been applied for the description of the highest level of the
hierarchy, the superchunks.
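
The structure of the model can be made concrete with a short sketch of its right-hand sides. It assumes the following reading of Equations (2)-(6): item growth in (2) is gated by the chunk variable, the chunk equation (3) is driven by the superchunk variable $V^{l}$, and (4) and (6) are first-order kinetics for the delayed inhibition. The array layout, the dictionary keys, the name `zeta` for the superchunk matrix, and the toy parameter values are all hypothetical choices made for this illustration.

```python
import numpy as np

def chunking_rhs(X, Y, Z, V, W, p):
    """Right-hand sides of the three-level model as read here.

    Shapes: X (P, M, N); Y, Z (P, M); V, W (P,).
    p holds sigma (P, M, N), rho (P, M, N, N), xi (P, M, M), zeta (P, P)
    and the scalars beta, delta, tau, theta, T, Theta.
    """
    item_inhibition = np.einsum("lkij,lkj->lki", p["rho"], X)
    dX = X * (p["sigma"] * Y[..., None] - item_inhibition)              # Eq. (2)
    dY = Y * (V[:, None] - p["beta"] * X.sum(axis=-1) - Z) / p["tau"]    # Eq. (3)
    dZ = (np.einsum("lkm,lm->lk", p["xi"], Y) - Z) / p["theta"]          # Eq. (4)
    dV = V * (1.0 - p["delta"] * Y.sum(axis=-1) - W) / p["T"]            # Eq. (5)
    dW = (p["zeta"] @ V - W) / p["Theta"]                                # Eq. (6)
    return dX, dY, dZ, dV, dW

# Toy sizes (hypothetical): one superchunk, two chunks, two items per chunk.
P, M, N = 1, 2, 2
params = {
    "sigma": 2.0 * np.ones((P, M, N)),
    "rho": np.tile(np.eye(N), (P, M, 1, 1)),
    "xi": np.ones((P, M, M)),
    "zeta": 3.0 * np.ones((P, P)),
    "beta": 0.25, "delta": 0.25,
    "tau": 0.5, "theta": 2.0, "T": 0.5, "Theta": 3.0,
}
state = (np.ones((P, M, N)), np.ones((P, M)), np.zeros((P, M)), np.ones(P), np.zeros(P))
derivs = chunking_rhs(*state, params)
```

For this toy state every right-hand side evaluates to 1, which gives a convenient hand check of each equation. The function can be passed to any explicit integrator; the two-level case discussed below corresponds to holding `V` fixed at 1 and ignoring `dV` and `dW`.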

RESULTS: HIERARCHICAL SEQUENCES— CHUNKING AND 
SUPER-CHUNKING 

Let us first represent the phase portrait of simple two-level
chunking dynamics. We carried out numerical simulations of
the model for the dynamics within chunks of informational
items with the following parameters: $N_k = 3$, $M = 3$ (number
of "chunks" or "episodes"), $\sigma^{1} = [7.24, 5.85, 8.30]$,
$\sigma^{2} = [9.93, 6.00, 5.18]$, $\sigma^{3} = [8.29, 7.86, 9.16]$, and, given these
values, $\rho^{k}_{ii} = 1.0$, $\rho^{k}_{i_{n-1} i_n} = \sigma^{k}_{i_{n-1}} / \sigma^{k}_{i_n} + 0.51$, and
$\rho^{k}_{i_{n+1} i_n} = \sigma^{k}_{i_{n+1}} / \sigma^{k}_{i_n} - 0.5$, $i = 1, \ldots, N_k$, $k = 1, \ldots, M$,
as well as the parameters considered for the synaptic dynamics described
by Equations (3) and (4): $\tau = 0.7$, $\theta = 2.0$, $\xi_{k_n k_n} = 1.0$,
$\xi_{k_n k_{n+1}} = 1.4$, and $\xi_{k_n k_{n-1}} = 0.5$, $k = 1, \ldots, M$, and
$\beta = 0.01$. The results of these simulations are shown in Figures 2, 3.

Figure 2 shows the phase portrait of the chunking dynamics
when the superchunk formation is absent: the system is described
by Equations (2)-(4) with $V^{l} = 1$. This example illustrates a closed
chunking sequence (green) that consists of several heteroclinic
cycles that represent the elementary chunks (blue). In general, the
number of elementary items can differ from chunk to chunk and the
chunking sequence can be open.

Figure 3 illustrates the timing between chunks along the
sequence. The emergence of the chunking sequence shown in
Figure 2 is the result of a modulational instability in the two-level
hierarchical network whose dynamics is described by Equations
(2)-(4). This instability is oscillatory. The characteristic period of
the oscillation is $\Delta T$. An analytical investigation of the dependence
of $\Delta T$ on the control parameters $\tau$, $\theta$, $\beta$ and the connection
matrices $\rho$, $\xi$ is not feasible because of the non-linear
feedback between the dynamical variables $X$ and $Y$. However, it
is reasonable to think that the key parameter in this problem is
$\beta$, which determines the level of excitability of variable $Y$ and,
according to the feedback, also controls the excitability of $X$ [term
$\sigma_i^{lk}(S, C) \cdot X_i^{lk} \cdot Y^{lk}$ on the right-hand side of Equation (2)]. In
Figure 3 we present the numerical analysis of the dependence
of $\Delta T$ on the parameter $\beta$: increasing $\beta$, i.e., decreasing the
excitability, leads to a decrease of the timing interval $\Delta T$.

FIGURE 2 | The projection of a nine-dimensional phase portrait of
two-level chunking hierarchical dynamics in the space of the
three-dimensional auxiliary variables [see Equations (2)-(4)]
$J_1 = Y^{1} + 0.04 \cdot (X_1^{1} + X_2^{1} + X_3^{1})$, $J_2 = Y^{2} + 0.04 \cdot (X_1^{2} + X_2^{2} + X_3^{2})$,
$J_3 = Y^{3} + 0.04 \cdot (X_1^{3} + X_2^{3} + X_3^{3})$. Blue represents the elementary
informational item activity (individual chunk). Green represents the
chunking sequence.

FIGURE 3 | The dependence of the chunking interval timing [see
Equation (1)] on the control parameter $\beta$. One can see that the chunking
interval strongly decreases with increasing adaptation parameter $\beta$.

FIGURE 4 | Time series of the sequences of the three-level
hierarchy: 108 items grouped in 18 chunks of 6 items; these chunks
form 3 superchunks of 6 elements each, displaying reproducible
dynamics according to the model (2)-(6). Different colors correspond to
different items inside each group (switching the color means moving from
the previous item to the next one).

We also carried out numerical simulations of a high-dimensional
model that describes the dynamics of chunk
and super-chunk formation with the following parameters:
$N^{lk} = 6$, $M^{l} = 6$ (number of chunks), $P = 3$ (number
of superchunks), $\sigma^{11} = [6.94, 5.11, 8.94, 5.86, 8.33, 9.62]$,
$\sigma^{12} = [5.48, 5.66, 5.39, 9.89, 9.99, 5.82]$, $\sigma^{13} = [7.65, 8.98, 9.21, 6.02, 5.71, 5.12]$,
$\sigma^{14} = [7.61, 7.73, 5.62, 7.93, 5.80, 5.39]$, $\sigma^{15} = [5.11, 9.99, 5.52, 5.66, 5.50, 8.21]$,
$\sigma^{16} = [5.84, 9.39, 7.08, 5.16, 8.37, 6.87]$, and, given these values,
$\rho^{lk}_{ii} = 1.0$, $\rho^{lk}_{i_{n-1} i_n} = \sigma^{lk}_{i_{n-1}} / \sigma^{lk}_{i_n} + 0.51$,
$\rho^{lk}_{i_{n+1} i_n} = \sigma^{lk}_{i_{n+1}} / \sigma^{lk}_{i_n} - 0.5$, $i = 1, \ldots, N^{lk}$, $k = 1, \ldots, M^{l}$,
$l = 1, \ldots, P$, and
$\rho^{lk}_{i i_n} = \rho^{lk}_{i_{n-1} i_n} + (\sigma^{lk}_{i} - \sigma^{lk}_{i_{n-1}}) / \sigma^{lk}_{i_n} + 2$, $i \notin \{i_{n-1}, i_n, i_{n+1}\}$, as

well as the parameters considered for the synaptic dynam- 
ics between chunks described by the equations x = 0.8, 
0 = 2.0, %f = 1.0, %\ nkn - 1 = 0.5, %\ nkn + l = 1.4, %\ nkn + l = 1.3, 
i = L5j /c — 1, . . . , M\ I = 1, . . . , P, %f n = %\ n - lkn +2, 
k ^ {k n -\, k n , /c w + i}, and P = 0.01. Finally, the parameters 
for the synaptic dynamics between superchunks were $T = 5$,
$\Theta = 10$, $\varsigma_{ll} = 1.0$, $\varsigma_{l_n l_{n-1}} = 0.5$, $\varsigma_{l_n l_{n+1}} = 1.4$, $l = 1, \ldots, P$,
and $\delta = 0.01$. The results of these simulations are displayed in
Figure 4, which shows three levels of information hierarchy:
original informational chain (lower panel), chunked chain
(middle panel), and superchunking chain (upper panel).

As illustrated in Figure 2, the sequence of chunks can be considered
as a heteroclinic cycle of metastable states where each
metastable state itself is a heteroclinic cycle of elementary
informational items. Based on this self-similarity, we can expect that
the chunking chain, as a result of a second heteroclinic instability,
generates the next level of modulation: the superchunk
sequence. Our expectation is confirmed in Figure 4, which shows
the time series of the dynamics of the three-level network (2)-(6)
(cf. Figure 1). In this figure, one can see the generation of sequences
of superchunks. Altogether, the sequences of informational items,
chunks, and superchunks can be interpreted as "words," "sentences,"
and "paragraphs."
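
The bookkeeping behind the three-level hierarchy of Figure 4 (108 items, 18 chunks of 6 items, 3 superchunks of 6 chunks) can be sketched as plain index arithmetic. This is only an illustration of the grouping, not of the dynamics that produce it; the function names `locate` and `nest` are hypothetical.

```python
def locate(item_index, items_per_chunk=6, chunks_per_superchunk=6):
    """Map a position in the flat item chain to (superchunk, chunk, item) indices."""
    chunk_index, item = divmod(item_index, items_per_chunk)
    superchunk, chunk = divmod(chunk_index, chunks_per_superchunk)
    return superchunk, chunk, item

def nest(sequence, items_per_chunk=6, chunks_per_superchunk=6):
    """Group a flat sequence into chunks, then group the chunks into superchunks."""
    chunks = [sequence[i:i + items_per_chunk]
              for i in range(0, len(sequence), items_per_chunk)]
    return [chunks[i:i + chunks_per_superchunk]
            for i in range(0, len(chunks), chunks_per_superchunk)]

items = list(range(108))      # flat chain of elementary items
hierarchy = nest(items)       # hierarchy[superchunk][chunk][item]
```

In the language analogy of the text, `hierarchy[l]` plays the role of a "paragraph", `hierarchy[l][k]` of a "sentence", and `hierarchy[l][k][i]` of a "word".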

For the sake of simplicity we have illustrated here the phenomenon
of stability just for a closed-loop clustered
chunking-superchunking sequence. In the general case of an open sequence, it
is possible to formulate sufficient conditions for the existence
and stability of the non-closed channel based on the estimation
of the saddle values of the metastable states (elementary
items): the channel is stable if all of them are
larger than one in absolute value (Afraimovich et al., 2004; Bick
and Rabinovich, 2010). The formulation of the necessary conditions
is a more complex problem and is still under consideration.
The imposed stability conditions determine the behavior of the
trajectories inside the neighborhood of the heteroclinic network
independently of the initial conditions, as computer experiments
have confirmed (Afraimovich et al., 2004; Bick and Rabinovich,
2010).
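
The saddle-value criterion can be checked numerically for a single chunk. The sketch below assumes a cyclic item order 0, 1, ..., n-1, unit self-inhibition, and the connection rule in which the previous item in the sequence receives inhibition $\sigma_{i_{n-1}}/\sigma_{i_n} + 0.51$, the next item $\sigma_{i_{n+1}}/\sigma_{i_n} - 0.5$, and any remaining item an additional offset of 2. The saddle value is taken here as the ratio of the weakest contracting eigenvalue to the expanding one, in absolute value. The function names are hypothetical, and the growth rates are those listed for the first chunk of the two-level simulation.

```python
import numpy as np

def cyclic_rho(sigma, stable=0.51, unstable=0.5, extra=2.0):
    """Inhibition matrix for the cyclic sequence 0 -> 1 -> ... -> n-1 -> 0 (needs n >= 3)."""
    sigma = np.asarray(sigma, dtype=float)
    n = sigma.size
    rho = np.eye(n)
    for c in range(n):                               # c is the currently active item
        prev, nxt = (c - 1) % n, (c + 1) % n
        rho[prev, c] = sigma[prev] / sigma[c] + stable
        rho[nxt, c] = sigma[nxt] / sigma[c] - unstable
        for j in range(n):
            if j not in (prev, c, nxt):
                rho[j, c] = rho[prev, c] + (sigma[j] - sigma[prev]) / sigma[c] + extra
    return rho

def saddle_values(sigma, rho):
    """Saddle value of each single-item saddle A_c = sigma_c, all other items silent."""
    sigma = np.asarray(sigma, dtype=float)
    nu = np.empty(sigma.size)
    for c in range(sigma.size):
        lam = sigma - rho[:, c] * sigma[c]           # growth rates of the silent items
        lam[c] = -sigma[c]                           # radial eigenvalue when rho_cc = 1
        nu[c] = -lam[lam < 0].max() / lam[lam > 0].max()
    return nu

sigma_1 = [7.24, 5.85, 8.30]
nu = saddle_values(sigma_1, cyclic_rho(sigma_1))
print(nu)
```

Under this rule the expanding eigenvalue at the saddle of item $i_n$ is $0.5\,\sigma_{i_n}$ and the weakest contracting one is $-0.51\,\sigma_{i_n}$, so every saddle value equals 0.51/0.5 = 1.02 regardless of the growth rates, which places the sequence just inside the stability condition.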

The above described numerical results can be justified by an
analytical study of the system

$$\dot{X}_i^{k} = X_i^{k}\Big(\sigma_i^{k}\, Y^{k} - \sum_{j=1}^{N_k} \rho_{ij}^{k} X_j^{k}\Big),$$

$$\tau\,\dot{Y}^{k} = Y^{k}\Big(1 - \beta \sum_{i=1}^{N_k} X_i^{k} - Z^{k}\Big), \tag{7}$$

$$\theta\,\dot{Z}^{k} = \sum_{m=1}^{M} \xi_{km} Y^{m} - Z^{k},$$

$i = 1, \ldots, N_k$, $k = 1, \ldots, M$. For the sake of simplicity, let us
assume that $\tau = \theta \ll 1$, so one can apply geometric singular
perturbation theory (see, for instance, Jones, 1995; Hek,
2010 and references therein). In order to avoid confusion, it is
important to say that the assumption $\tau = \theta \ll 1$ implies that,
in contrast to the dynamics of $X$, the chunking dynamics is a
composition of fast and slow motions. The fast motions lead the
variables $Y^{k}$ and $Z^{k}$ to a neighborhood of the slow manifold
in the phase space. The evolution of the chunk variables
on this manifold in the vicinity of the metastable states is much
slower than that of the $X$ variables. This corresponds to the intuitively
clear fact that the "enveloping" variables mimic the averaged
dynamics of $X$. Computer experiments confirm this explanation
(see Figure 4).

Table 1 | Sequential dynamics in neural and cognitive systems.

| Phenomenon/image | Model | References | Comments |
| --- | --- | --- | --- |
| Voting paradox / Structurally stable heteroclinic cycle | Kinetic (rate) equation, Lotka-Volterra model | Krupa, 1997; Stone and Armbruster, 1999; Ashwin et al., 2003; Postlethwaite and Dawes, 2005 | J. C. Borda and the Marquis de Condorcet (De Borda, 1781; Saari, 1995) analyzed the process of plurality elections at the French Royal Academy of Sciences. They predicted the absence of a winner in a 3 step voting process (Condorcet's triangle) |
| Learning sequences | Hopfield type non-symmetric networks with time delay including spiking neuron models | Amari, 1972; Kleinfeld, 1986; Sompolinsky and Kanter, 1986; Minai and Levy, 1993; Deco and Rolls, 2005 | Networks proposed to explain the generation of rhythmic motor patterns and the recognition and recall of sequences |
| Latching dynamics | Potts network is able to hop from one discrete attractor to another under random perturbation to make a sequence | Treves, 2005; Russo et al., 2008; Russo and Treves, 2011; Linkerhand and Gros, 2013 | The dynamics can involve sequences of continuously latching transient states |
| Sequential memory with synaptic dynamics / Chaotic itinerancy sequences of Milnor attractors or attractor ruins | Spike-frequency-adaptation mechanism. Noisy dynamical systems. Cantor coding | Tsuda, 2009 | Proposed to be involved in episodic memory and itinerant process of cognition |
| Winnerless sequential switchings along metastable states / Stable heteroclinic channel | Generalized coupled Lotka-Volterra equations | Afraimovich et al., 2004; Rabinovich et al., 2008a,b | Information processing with transient dynamics at many different description levels from simple networks to cognitive processes |
| Winnerless competitive dynamics in spiking brain networks | Random inhibitory networks of spiking neurons in the striatum | Ponzi and Wickens, 2010 | Neurons form assemblies that fire in sequential coherent episodes and display complex identity-temporal spiking patterns even when cortical excitation is constant or fluctuating noisily |
| Sequences of sequences / Hierarchical transient sequences | Recognition of sequence of sequences based on a continuous dynamical model | Kiebel et al., 2009 | Speech can be considered as a sequence of sequences and can be implemented robustly by a dynamical model based on Bayesian inference; recognition dynamics disclose inference at multiple time scales |

The limit slow manifold has the equations

$$Y^{k}\Big(1 - \beta \sum_{i=1}^{N_k} X_i^{k} - Z^{k}\Big) = 0, \qquad \sum_{m=1}^{M} \xi_{km} Y^{m} - Z^{k} = 0,$$

thus $\sum_{m=1}^{M} \xi_{km} Y^{m} = 1 - \beta \sum_{i=1}^{N_k} X_i^{k}$. Denote by $\Xi$ the
$M \times M$ matrix $(\xi_{km})$. If $\det \Xi \neq 0$, we find

$$Y^{k} = \frac{1}{\det \Xi}\Big(\sum_{m=1}^{M} \Xi_{mk} - \beta \sum_{m=1}^{M} \Xi_{mk} \sum_{i=1}^{N_m} X_i^{m}\Big), \tag{8}$$

where $\Xi_{mk}$ is the cofactor of the entry $\xi_{mk}$ of the matrix $\Xi$.
Substituting this expression into the first equation of the system
(7) we obtain the system

$$\dot{X}_i^{k} = X_i^{k}\Big(\frac{\sigma_i^{k}}{\det \Xi} \sum_{m=1}^{M} \Xi_{mk} - \sum_{j=1}^{N_k} \rho_{ij}^{k} X_j^{k} - \frac{\beta\,\sigma_i^{k}}{\det \Xi} \sum_{m=1}^{M} \Xi_{mk} \sum_{j=1}^{N_m} X_j^{m}\Big), \tag{9}$$

$i = 1, \ldots, N_k$, $k = 1, \ldots, M$, which is similar to the binding
model described in Rabinovich et al. (2010a). In particular, the
"in-chunk" dynamics in (9) corresponds to the dynamics in the
modality subspace in Rabinovich et al. (2010a). The main peculiarity
of the system (9) is that the coupling coefficients
between different chunks have the common factor $\beta$, so if $\beta = 0$
then the interaction between different chunks is absent. Similarly
to the study in Rabinovich et al. (2010a), one can impose conditions
under which there exists a heteroclinic cycle for each chunk
and successive heteroclinic connections between saddle points
in different cycles. The last condition has the form $\beta > \beta_{cr}$, where
$\beta_{cr}$ depends on the parameters of the system (9). If $\tau$ is small,
then by geometric singular perturbation theory the
imposed conditions guarantee the existence of a heteroclinic
network in the original system (7) corresponding to the
"in-chunk" and "inter-chunk" dynamics.

Observations on the temporal chunk signal have focused on 
the use of pauses in behavior to probe chunk structures in WM. 
On the basis of some of these studies, a hierarchical process 
model has been proposed that consists of four hierarchical 
levels describing different kinds of pauses. The lowest level 
consists of pauses between strokes within letters. On higher 
levels, there are pauses between letters, words, and phrases. 
Each level is associated with a larger amount of processing when 
retrieving these chunks from memory (Cheng and Rojas-Anaya, 
2006). Writing may be an effective approach to the study of 
cognitive phenomena that involve the processing of chunks. In 
Cheng and Rojas-Anaya (2003), it was demonstrated that in the 
writing of simple number sequences the pauses between written 
elements (digits) within a chunk are shorter than the pauses 
between elements across a chunk boundary. This temporal signal 
is apparent in un-aggregated data for individual participants in 
single trials. Mathematically, the time intervals between chunks 
and super-chunks are controlled by parameter p (see Equation 3). 
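
The pause signal described above suggests a simple way to read 
chunk boundaries off a single trial. The following Python sketch 
is only an illustration of that idea: the function name 
split_into_chunks, the digit sequence, the pause values, and the 
300 ms threshold are all hypothetical and are not taken from the 
cited experiments. 

```python
def split_into_chunks(items, pauses, threshold):
    """Group items into chunks, starting a new chunk after every long pause.

    pauses[i] is the pause between items[i] and items[i + 1], so there must
    be exactly one pause fewer than there are items.
    """
    if not items:
        return []
    if len(pauses) != len(items) - 1:
        raise ValueError("expected one pause between each pair of items")

    chunks = [[items[0]]]
    for item, pause in zip(items[1:], pauses):
        if pause > threshold:
            # A long pause is read as a chunk boundary (assumed criterion).
            chunks.append([item])
        else:
            # A short pause keeps the item inside the current chunk.
            chunks[-1].append(item)
    return chunks


# Hypothetical single-trial data: written digits and inter-digit pauses (ms).
digits = [3, 1, 4, 1, 5, 9, 2, 6, 5]
pauses_ms = [120, 110, 420, 130, 115, 450, 125, 118]

chunks = split_into_chunks(digits, pauses_ms, threshold=300)
print(chunks)
```

A fixed threshold is the simplest possible criterion; with real 
data the boundary between "short" and "long" pauses would have to 
be estimated per participant. 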

DISCUSSION 

In this paper we have shown how the architecture of 
hierarchical mental model networks affects their associated 
functions. The discussed examples illustrate that networks with 
metastable states having several unstable separatrices exhibit 
very diverse cognitive functions (behavior). Complex 
heteroclinic networks allow completely new dynamical phenomena, 
and one of the primary challenges is the assessment of the 
existence and stability of hierarchical chunking processes that 
can represent cognitive activity. 

It is important to recall that the modeling of cycling and 
sequential dynamics in behavior and cognition has a long 
history (see several representative efforts in Table 1). Most of 
these models are based on Hopfield-type networks. The main 
problem there is to maintain the stability of the recalled 
sequences against noise. 

The results on chunking dynamics reported in this paper are 
relevant to the description of different cognitive tasks. For 
example, in WM, humans encode items and synthesize them. In 
doing so, we give meaning to ideas and find a relevant place 
for them in our cognitive world. In these actions the 
interaction between WM and chunking is reciprocal: on the one 
hand, WM is the "engine" of chunking, and on the other hand, 
chunking increases WM capacity. 

The model of chunking dynamics discussed in this paper relies 
on heteroclinic dynamics. It is important to emphasize that the 
main features of the SHC do not depend on the specific model 
used. The conditions of existence and the dynamical features of 
SHCs can be implemented in a wide variety of models: from 
simple Lotka-Volterra descriptions to complex Hodgkin-Huxley 
models, and from small networks to large ensembles of many 
elements (Varona et al., 2002; Venaille et al., 2005; Nowotny and 
Rabinovich, 2007; Rabinovich et al., 2012). The intrinsic 
hierarchical nature of the SHC at different temporal and spatial 
scales allows implementing many types of cognitive dynamics. 
Within this framework, brain networks can be viewed as 
non-equilibrium systems and their associated computations as 
unique patterns of transient activity, controlled by incoming 
input. The results of these computations can be reproducible, 
robust against noise, and easily decoded. Using asymmetric 
inhibition appropriately, the space of possible states of large 
neural systems can be restricted to connected saddle points, 
forming SHCs. These channels can be thought of as underlying 
reliable transient brain dynamics. Table 2 summarizes four types 
of heteroclinic networks that can describe different aspects of 
sequential dynamics in cognitive processes: (i) a canonical 
heteroclinic network that produces reproducible sequential 
switching from one metastable state to another inside one 
modality (as in a simple WM task); (ii) a network displaying 
inhibitory-based heteroclinic binding dynamics that is 
responsible for the stable perception of a subject based on 
three different modalities; (iii) two different modalities 
dynamically coordinated by excitatory connections; (iv) a 
chunking heteroclinic network that controls the grouping of 
elements of sequential behavior. 
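
As a minimal sketch of the canonical case (i), the following 
Python code integrates a three-state generalized Lotka-Volterra 
network, dx_i/dt = x_i (sigma_i - sum_j rho_ij x_j), with 
asymmetric inhibition and weak additive noise, and records which 
state is dominant over time. The values of rho, sigma, the noise 
level, the step size, and the helper name simulate_shc are 
illustrative assumptions and are not parameters taken from this 
paper. 

```python
import random


def simulate_shc(rho, sigma, x0, dt=0.01, steps=60000, noise=1e-6, seed=0):
    """Euler integration of dx_i/dt = x_i * (sigma_i - sum_j rho_ij * x_j).

    A small positive random term is added at every step so that no variable
    decays all the way to zero. Returns the ordered list of dominant
    ("winner") indices, recorded each time the dominant state changes.
    """
    rng = random.Random(seed)
    n = len(x0)
    x = list(x0)
    winners = []
    for _ in range(steps):
        growth = [sigma[i] - sum(rho[i][j] * x[j] for j in range(n))
                  for i in range(n)]
        x = [x[i] + dt * x[i] * growth[i]
             + noise * dt ** 0.5 * abs(rng.gauss(0.0, 1.0))
             for i in range(n)]
        w = max(range(n), key=lambda i: x[i])
        if not winners or winners[-1] != w:
            winners.append(w)
    return winners


# Asymmetric inhibition (assumed values): each state inhibits its successor
# weakly (0.5 < 1) and its predecessor strongly (2 > 1), so every saddle has
# exactly one unstable direction, pointing to the next state in the chain.
rho = [[1.0, 2.0, 0.5],
       [0.5, 1.0, 2.0],
       [2.0, 0.5, 1.0]]
sigma = [1.0, 1.0, 1.0]

winners = simulate_shc(rho, sigma, x0=[0.6, 0.3, 0.1])
print(winners[:7])
```

With these assumed weights the compression at each saddle 
(rate 1) exceeds the expansion (rate 0.5), which is the kind of 
condition under which the switching order stays fixed in the 
presence of weak noise; the dominant index should therefore 
advance cyclically through the three states. 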

Mathy and Feldman have recently suggested using Kolmogorov 
complexity and compressibility (Mathy and Feldman, 2012) to 
define a "chunk": a chunk is a unit in a maximally compressed 
code. The authors presented a series of experiments in which 
they manipulated the compressibility of stimulus sequences by 
introducing sequential patterns of variable length. To explore 
the influence of chunking on the capacity limits of WM, and 
departing from Bick and Rabinovich (2009), the authors of Li 
et al. (2013) have suggested a model for chunking in sequential 
WM. This model also uses hierarchical bidirectional 
inhibition-connected neural networks with WLC. Assuming no 
interaction between a basic sequence and a chunked sequence, 
and the existence of an upper bound on the inhibitory weights of 
the network, the authors show that chunking increases the number 
of memorized items in WM from the "magical number" 7 to 16 
items. The optimal number of chunks and the number of memorized 
items in each chunk correspond to the "magical number 4." 
Table 2 | Heteroclinics in mind. 

Phenomenon | Network formalism* | Phase portrait | Time series 

Sequential heteroclinic switching | $\dot{x}_i = x_i \left[ \sigma_i - \sum_{j=1}^{N} \rho_{ij} x_j \right]$ | [figure panel] | [figure panel] 

[label not recoverable] | [equation not recoverable from the extracted text] | [figure panel] | [figure panel] 

Heteroclinic cooperation | [equation not recoverable from the extracted text] | [figure panel] | [figure panel] 

Hierarchical chunking memory and learning | [equation not recoverable from the extracted text] | [figure panel] | [figure panel; horizontal axis: time, 0 to 160] 

*See the definition of the variables and parameters in the text. 



Recent experiments have confirmed the existence of three 
levels of cognitive hierarchy (see Rosenberg and Feigenson, 
2013). In that study, the authors reported that infants can 
unify the representation of chunks into "super-chunks." 

The chunking models discussed above can be generalized to 
more complex cases. In particular, by adding attention control 
to the network hierarchy, it is possible to analyze the binding 
of sequences of chunks. The brain could use such binding to 
perform many cognitive functions, such as the coordination of 
visual perception with speech comprehension, or the 
coordination of music chunks and word chunks in singing. It is 
well known that viewing a speaker's articulatory movements 
substantially improves a listener's ability to understand spoken 
words, especially under noisy environmental conditions such as 
a crowded cocktail party. Ross and coauthors claimed that this 
effect is most pronounced when the auditory input is weakest. As 
a result of attentional binding (multisensory integration), a 
substantial gain in multisensory speech enhancement is achieved 
even at the lowest signal-to-noise ratios (Ross et al., 2007). 

The dynamics of hierarchical heteroclinic networks is also 
able to explain and predict the coordination of behavioral 
elements with different time scales (for a study of the 
coordination of sensorimotor dynamics see Jantzen and Kelso, 
2007). Functionally, this kind of synchronization can be the 
result of learning, that is, of changes in the strength of 
inhibitory connections between agents at different levels of 
the hierarchy that coordinate dynamics with different time 
scales (see Figure 3). Additionally, it is important to note 
that the winnerless competitive learning process itself can be 
chaotic (Komarov et al., 2010), which provides wider 
possibilities for adaptability. 